Heuristic Principal Component Analysis-Based Unsupervised Feature Extraction and Its Application to Bioinformatics

نویسندگان

  • Mitsuo Iwadate
  • Hideaki Umeyama
  • Yoshiki Murakami
چکیده

Feature Extraction (FE) is a difficult task when the number of features is much larger than the number of samples, although that is a typical situation when biological (big) data is analyzed. This is especially true when FE is stable, independent of the samples considered (stable FE), and is often required. However, the stability of FE has not been considered seriously. In this chapter, the authors demonstrate that Principal Component Analysis (PCA)-based unsupervised FE functions as stable FE. Three bioinformatics applications of PCA-based unsupervised FE–detection of aberrant DNA methylation associated with diseases, biomarker identification using circulating microRNA, and proteomic analysis of bacterial culturing processes–are discussed.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Feature reduction of hyperspectral images: Discriminant analysis and the first principal component

When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...

متن کامل

Development of a cell formation heuristic by considering realistic data using principal component analysis and Taguchi’s method

Over the last four decades of research, numerous cell formation algorithms have been developed and tested, still this research remains of interest to this day. Appropriate manufacturing cells formation is the first step in designing a cellular manufacturing system. In cellular manufacturing, consideration to manufacturing flexibility and productionrelated data is vital for cell formation....

متن کامل

Comparative Analysis of Wavelet-based Feature Extraction for Intramuscular EMG Signal Decomposition

Background: Electromyographic (EMG) signal decomposition is the process by which an EMG signal is decomposed into its constituent motor unit potential trains (MUPTs). A major step in EMG decomposition is feature extraction in which each detected motor unit potential (MUP) is represented by a feature vector. As with any other pattern recognition system, feature extraction has a significant impac...

متن کامل

Improved Algorithm for Fully-automated Neural Spike Sorting based on Projection Pursuit and Gaussian Mixture Model

For the analysis of multiunit extracellular neural signals as multiple spike trains, neural spike sorting is essential. Existing algorithms for the spike sorting have been unsatisfactory when the signal-to-noise ratio (SNR) is low, especially for implementation of fully-automated systems. We present a novel method that shows satisfactory performance even under low SNR, and compare its performan...

متن کامل

An application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case

Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to some existing multiple variables and dynamic factors derived from uncertainties surrounding the PPC. Although literatures on exact scheduling algorithms, simulation approaches, and heuristic methods are extensive in production planning, they seem to be ineff...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015